Abstract

DNA:RNA hybrid formation is emerging as a significant cause of genome instability in biological systems ranging from 
bacteria to mammals. Here we describe the genome-wide distribution of DNA:RNA hybrid prone loci in Saccharomyces 
cerevisiae by DNA:RNA immunoprecipitation (DRIP) followed by hybridization on tiling microarray. These profiles show that 
DNA:RNA hybrids preferentially accumulated at rDNA, Ty1 and Ty2 transposons, telomeric repeat regions and a subset of
open reading frames (ORFs). The latter are generally highly transcribed and have high GC content. Interestingly, significant 
DNA:RNA hybrid enrichment was also detected at genes associated with antisense transcripts. The expression of antisense-associated
genes was also significantly altered upon overexpression of RNase H, which degrades the RNA in hybrids. Finally,
we uncover mutant-specific differences in the DRIP profiles of a Sen1 helicase mutant, RNase H deletion mutant and Hpr1
THO complex mutant compared to wild type, suggesting different roles for these proteins in DNA:RNA hybrid biology. Our 
profiles of DNA:RNA hybrid prone loci provide a resource for understanding the properties of hybrid-forming regions in vivo, 
extend our knowledge of hybrid-mitigating enzymes, and contribute to models of antisense-mediated gene regulation. A 
summary of this paper was presented at the 26th International Conference on Yeast Genetics and Molecular Biology, August
2013. 
Introduction

Elevated DNA:RNA hybrid formation due to defects in RNA
processing pathways leads to genome instability and replication
stress across species [1-7]. R loops threaten genome stability and
often form under abnormal conditions where nascent mRNA is
improperly processed or RNA half-life is increased, resulting in
RNA that can hybridize with template DNA, displacing the
non-transcribed DNA strand [8]. A recent study also found that hybrid
formation can occur in trans via Rad51-mediated DNA-RNA
strand exchange [9]. Persistent R loops pose a major threat to
genome stability through two mechanisms. First, the exposed
non-transcribed strand is susceptible to endogenous DNA damage due
to the increased exposure of chemically reactive groups. The
second, more widespread mechanism, identified in Escherichia coli,
Saccharomyces cerevisiae, Caenorhabditis elegans and human cells,
involves the R loops and associated stalled transcription complexes,
which block DNA replication fork progression [3,4,8,10,11]. R
loop-mediated instability is an area of great interest primarily
because genome instability is considered an enabling characteristic
of tumor formation [12]. Moreover, mutations in RNA splicing/processing
factors are frequently found in human cancer, heritable
diseases like Aicardi-Goutières syndrome, and a degenerative
ataxia associated with Senataxin mutations [13-17].

To avoid the deleterious effects of R loops, cells express enzymes
for the removal of abnormally formed DNA:RNA hybrids. In S.
cerevisiae, RNH1 and RNH201, each encoding RNase H, are
responsible for one of the best characterized mechanisms for
reducing R loop formation by enzymatically degrading the RNA
in DNA:RNA hybrids [8]. Another extensively studied anti-hybrid
factor is the THO/TREX complex, which functions to suppress
hybrid formation at the level of transcription termination and
mRNA packaging [4,11,18,19]. In addition, the Senataxin
helicase, yeast Sen1, plays an important role in facilitating
replication fork progress through transcribed regions and unwinding
RNA in hybrids to mitigate R loop formation and RNA
polymerase II transcription-associated genome instability [5,20].
Several additional anti-hybrid mechanisms have also been
identified, including topoisomerases and other RNA processing
factors [2,6,7,9,21-23].

Author Summary

RNA processing factors are mutated in human cancers,
inherited developmental disorders and neurodegenerative
syndromes. Defects in RNA processing have been associated
with increased levels of mutations and DNA damage
in part via the formation of DNA:RNA hybrids. Although it
is likely that specific regions of the genome are more
prone to DNA:RNA hybrid formation, a map of hybrid-prone
regions is not available. In this study, we describe
the genome-wide distribution of DNA:RNA hybrids in both
normal and mutant Saccharomyces cerevisiae cells. The
resulting profiles contribute to both our understanding of
the general properties of hybrid-forming loci and to our
knowledge of hybrid-mitigating enzymes. Interestingly,
significant DNA:RNA hybrid enrichment was detected at
genes associated with antisense transcription. We show
that overexpression of RNase H, which degrades the RNA
in hybrids, significantly affects the expression of genes
associated with antisense transcripts. These findings
support a role for DNA:RNA hybrids in regulation of gene
expression by antisense transcripts.

To add to the complexity of DNA:RNA hybrid management in
the cell, hybrids also occur naturally and have important biological
functions [24]. In human cells, R loop formation facilitates
immunoglobulin class switching, protects against DNA methylation
at CpG island promoters and plays a key role in pause site-dependent
transcription termination [25-28]. Transcription of
telomeres by RNA polymerase II also produces telomeric repeat-containing
RNAs (TERRA), which associate with telomeres and
inhibit telomere elongation in a DNA:RNA hybrid-dependent
fashion [29-31]. Noncoding (nc)RNAs, such as antisense transcripts,
perform a regulatory role in the expression of sense
transcripts that may involve R loops [32]. The proposed
mechanisms of antisense transcription regulation are not clearly
understood and involve different modes of action specific to each
locus. Current models include chromatin modification resulting
from antisense-associated transcription, antisense transcription
modulation of transcription regulators, collision of sense and
antisense transcription machineries and antisense transcripts
expressed in trans interacting with the promoter for sense
transcription [32-40]. More recently, studies in Arabidopsis thaliana
found an antisense transcript that forms R loops, which can be
differentially stabilized to modulate gene regulation [41]. Similarly,
in mouse cells the stabilization of an R loop was shown to
inhibit antisense transcription [42].

Here we describe, for the first time, a genome-wide profile of
DNA:RNA hybrid prone loci in S. cerevisiae by DNA:RNA
immunoprecipitation followed by hybridization on tiling microarrays
(DRIP-chip). We found that DNA:RNA hybrids occurred
at highly transcribed regions in wild type cells, including some
identified in previous studies. Remarkably, we observed that
DNA:RNA hybrids were significantly associated with genes that
have corresponding antisense transcripts, suggesting a role for
hybrid formation at these loci in gene regulation. Consistent with this, we
found that genes whose expression was altered by overexpression
of RNase H were also significantly associated with antisense
transcripts. A small-scale cytological screen found that diverse
RNA processing mutants had increased hybrid formation, and
additional DRIP-chip studies revealed specific hybrid-site biases in
the RNase H, Sen1 and THO complex subunit Hpr1 mutants.
These genome-wide analyses enhance our understanding of
DNA:RNA hybrid-forming regions in vivo, highlight the role of
cellular RNA processing activities in suppressing hybrid formation,
and implicate DNA:RNA hybrids in control of a subset of
antisense regulated loci.

Results 

The genomic distribution of DNA:RNA hybrids 

DNA:RNA hybrids have been previously immunoprecipitated
at specific genomic sites such as rDNA, selected endogenous loci,
and reporter constructs [2,5]. Subsequently, DRIP coupled with
deep sequencing in human cells has demonstrated the prevalence
of R loops at CpG island promoters with high GC skew [26]. To
investigate the global profile of DNA:RNA hybrid prone loci in a
tractable model, we performed genome-wide DRIP-chip analysis
of wild type S. cerevisiae (ArrayExpress E-MTAB-2388) using the
S9.6 monoclonal antibody, which specifically binds DNA:RNA
hybrids, as characterized previously [43,44]. DRIP-chip profiles
were generated in duplicate (Spearman's ρ = 0.78 when comparing
each of over 2 million probes after normalization and data
smoothing; Supplementary Figure S1) and normalized to a no
antibody control.
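The replicate comparison above relies on a rank correlation between probe-level scores. The following is a minimal sketch of how Spearman's ρ can be computed for two replicates, assuming each replicate is a list of normalized, smoothed probe scores in the same probe order. The function names and the toy values are illustrative assumptions and are not taken from the study's analysis pipeline.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical probe scores for two replicates (not data from the study).
replicate_1 = [0.2, 1.7, 0.9, 2.4, 1.1]
replicate_2 = [0.4, 1.5, 1.0, 2.0, 0.8]
rho = spearman_rho(replicate_1, replicate_2)
print(round(rho, 3))
```

Because the statistic depends only on ranks, it is insensitive to monotonic differences in signal scale between replicates, which is one reason a rank correlation is a natural choice for comparing array profiles.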

Overall, our DRIP-chip profiles identified several previously
reported DNA:RNA hybrid prone sites including the rDNA locus
and telomeric repeat regions (Figure 1, Supplementary
Tables S1, S2) [2,29-31]. DNA:RNA hybrids were also observed
at 1217 open reading frames (ORFs) (containing greater than 50%
of probes above the threshold of 1.5 and found in both wild type
replicates) (Supplementary Table S3). These were generally
shorter in length than average (p = 4.29e~'^^), highly transcribed
(Wilcoxon rank sum test p = 2.21e~^), and had higher GC content
(p = 2.52e"'^°) (Figure 2A, 2B and 2C, Supplementary Figure
S2). Importantly, despite the correlation between DNA:RNA
hybrid association and transcriptional frequency, the wild type
DRIP-chip profiles compared to the localization profile of the
RNA polymerase II subunit Rpb3 revealed very low correlation
(ρ = 0.0097; [45]). This suggests that the DRIP-chip method was
not unduly biased towards the short DNA:RNA hybrids that could
theoretically have been captured within active transcription
bubbles. Importantly, because genes with high GC content also
have high transcriptional frequencies (Supplementary Figure
S3), it is not clear from our findings whether GC content or
transcriptional frequency contributed more to DNA:RNA hybrid
forming potential. Furthermore, we observe that DNA:RNA
hybrid prone loci do not encode for mRNA transcripts with
particularly long half-lives (Supplementary Figure S2D),
suggesting that the act of transcription is vital to DNA:RNA
hybrid formation and supporting the notion of co-transcriptional
hybrid formation as the major source of endogenous DNA:RNA
hybrids.
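The ORF-calling criterion described above (more than 50% of probes above a threshold of 1.5, in both replicates) can be sketched as a simple filter. This is an illustration under stated assumptions: probe scores are taken to be already normalized and grouped per ORF, "above" is read as a strict inequality, and the criterion is applied to each replicate separately. The names `fraction_above`, `is_enriched` and `call_enriched_orfs`, and the toy data, are hypothetical.

```python
THRESHOLD = 1.5      # enrichment threshold stated in the text
MIN_FRACTION = 0.5   # "greater than 50% of probes"


def fraction_above(probe_scores, threshold=THRESHOLD):
    """Fraction of an ORF's probes whose score exceeds the threshold."""
    if not probe_scores:
        return 0.0
    return sum(1 for score in probe_scores if score > threshold) / len(probe_scores)


def is_enriched(replicate_scores):
    """True only if every replicate passes the >50% criterion on its own."""
    return all(fraction_above(scores) > MIN_FRACTION for scores in replicate_scores)


def call_enriched_orfs(orf_to_replicates):
    """Return the sorted names of ORFs called enriched in all replicates."""
    return sorted(
        name for name, replicates in orf_to_replicates.items() if is_enriched(replicates)
    )


# Hypothetical probe scores: (replicate 1, replicate 2) for each ORF.
toy_scores = {
    "ORF_A": ([1.8, 2.1, 1.6, 0.9], [1.7, 1.9, 2.2, 1.6]),  # 3/4 and 4/4 pass
    "ORF_B": ([1.8, 2.1, 1.6, 0.9], [1.2, 1.9, 0.8, 1.1]),  # second replicate fails
    "ORF_C": ([1.6, 1.4], [1.9, 1.0]),                      # exactly 50%, not >50%
}
enriched = call_enriched_orfs(toy_scores)
print(enriched)
```

Requiring each replicate to pass independently is the stricter of the plausible readings of "found in both wild type replicates"; averaging replicates first would be an alternative design with a different false-positive profile.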

Our data also revealed DNA:RNA hybrids highly associated
with Ty1 and Ty2 subclasses of retrotransposons (Figure 2E,
Supplementary Table S4). Consistent with our findings at
ORFs, the levels of DNA:RNA hybrids correspond well with the
known levels of expression of these elements. In general, Ty1,
which constitutes one of the most abundant transcripts in the cell,
has the highest levels of DNA:RNA hybrids. Ty3 and Ty4, which are
only slightly expressed, have much lower levels of hybrids, and the
lone Ty5 retrotransposon, which is transcriptionally silent, is not
enriched for DNA:RNA hybrids (Figure 2E) ([46-48]). In
contrast to the trends observed with ORFs, GC content in
retrotransposons is not highly correlated with the levels of
expression, suggesting that expression is the main contributor to
DNA:RNA hybrid formation. Specifically, Ty3 retrotransposons
have the highest GC content but have only modest levels of
expression and DNA:RNA hybrids.

Figure 1. Genome-wide profile of DNA:RNA hybrids in wild type yeast revealed enrichment at rDNA, telomeres, retrotransposons
and a subset of genes. DRIP-chip chromosome plot of DNA:RNA hybrids in the rDNA region and telomeric ends of chromosome XII. The black line
represents the average of two wild type replicates. Bars indicate ORFs (grey), rDNA (purple), retrotransposons (green) or genes associated with an
antisense transcript (red) [51,54]. Grey boxes delineate telomeric repeat regions. Y-axis indicates relative occupancy of DNA:RNA hybrids. X-axis
indicates chromosomal coordinates. P indicates probability of observing a number of enriched features by random chance below what was observed
(P>0.99997).
doi:10.1371/journal.pgen.1004288.g001

DNA:RNA hybrids are significantly correlated with genes 
associated with antisense transcripts 

Certain DNA:RNA hybrid enriched regions identified by
our DRIP-chip analysis such as rDNA and retrotransposons
are associated with antisense transcripts [49,50]. Therefore, we
checked if this was a common feature of DNA:RNA hybrid prone sites
by comparing our list of DNA:RNA hybrid prone loci to a list
of antisense-associated genes ([51]). Because the expression of
antisense-associated transcripts may be highly dependent on
environmental conditions, we based our analysis on a list of
transcripts identified in S288c yeast grown to mid-log phase in
rich media, which most closely mirrors the growth conditions of
our cultures analyzed by DRIP-chip ([51]). DNA:RNA hybrid
enriched genes significantly overlapped with antisense-associated
genes, suggesting that DNA:RNA hybrids may play a role
in antisense transcript-mediated regulation of gene expression
(Fisher's exact test p=1.03e"'^) (Figure 3A, 3B and 3C,
Supplementary Table S5).
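The overlap between hybrid-enriched genes and antisense-associated genes is assessed with Fisher's exact test. The sketch below shows the one-sided (enrichment) form of the test, computed from the hypergeometric distribution. It assumes the four counts are known; the text does not state whether a one- or two-sided test was used, and the function name and the toy counts are hypothetical, not values from the study.

```python
from math import comb


def fisher_enrichment_p(overlap, n_hybrid, n_antisense, n_total):
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= overlap), where X is the number of antisense-associated
    genes in a random draw of n_hybrid genes from n_total genes, of which
    n_antisense are antisense-associated (hypergeometric upper tail).
    """
    max_overlap = min(n_hybrid, n_antisense)
    total_ways = comb(n_total, n_hybrid)
    tail_ways = sum(
        comb(n_antisense, k) * comb(n_total - n_antisense, n_hybrid - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail_ways / total_ways


# Hypothetical toy counts: 10 genes in total, 5 antisense-associated,
# 4 hybrid-enriched, and all 4 hybrid-enriched genes antisense-associated.
p_value = fisher_enrichment_p(overlap=4, n_hybrid=4, n_antisense=5, n_total=10)
print(round(p_value, 4))
```

Python's arbitrary-precision integers keep the binomial coefficients exact even for genome-scale counts, so the only rounding occurs in the final division.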

RNase H overexpression reduces detectable levels of
DNA:RNA hybrids in cytological screens and suppresses genomic
instability associated with R loop formation, presumably through
the degradation of DNA:RNA hybrids [7,52,53]. To test for a
potential role of DNA:RNA hybrids in antisense-mediated gene
regulation, we performed gene expression microarray analysis of
an RNase H overexpression strain compared to an empty vector
control (GEO GSE46652). This identified genes that had
increased mRNA levels (upregulated n = 212) or decreased mRNA
levels (downregulated n = 88) as a result of RNase H overexpression.
A significant portion of the genes with increased mRNA
levels were antisense-associated (Fisher's exact test p = 2.9e)
(Figure 3D, Supplementary Table S5) and tended to have
high GC content, similar to DNA:RNA hybrid enriched genes in
wild type (Supplementary Figure S4). However, the genes with
increased mRNA levels under RNase H overexpression and the
antisense-associated genes enriched for DNA:RNA hybrids in our
DRIP experiment both tended towards lower transcriptional
frequencies (Figure 3E). These findings suggest that antisense-associated
DNA:RNA hybrids moderate the levels of gene
expression. Indeed, genes that were both modulated by RNase
H overexpression and enriched for DNA:RNA hybrids were all
found to be antisense-associated (Figure 3F).

Figure 2. DNA:RNA hybrids are enriched at protein-encoding genes and retrotransposons of higher transcriptional frequency. (A)
Average gene profile of DNA:RNA hybrids at ORFs enriched for DNA:RNA hybrids under wild type conditions. (B-D) CHROMATRA plots of DNA:RNA
hybrid distribution along genes sorted by their length (B), grouped into five transcriptional frequency categories as per [69] (C) or grouped into four
GC content categories (D). Genes were aligned by their TSSs. (E) The average DNA:RNA hybrid score at Ty1, Ty2, Ty3, Ty4 and Ty5 retrotransposons in
the left panel shows higher enrichment at Ty1 and Ty2 retrotransposons. The average profile of DNA:RNA hybrids at all retrotransposons under wild
type conditions is shown in the right panel.
doi:10.1371/journal.pgen.1004288.g002

Figure 3. Genes associated with DNA:RNA hybrids were significantly associated with antisense transcripts. (A) Antisense association of
DNA:RNA hybrid-enriched genes in wild type. The p-value indicates significant enrichment (Fisher's exact test) of antisense-associated genes among
DNA:RNA hybrid-enriched genes compared to the Yassour et al. 2010 antisense-annotated dataset ([51]). (B) CHROMATRA plots of DNA:RNA hybrid
distribution along genes sorted by their length and separated by whether they are antisense associated or not. Genes were aligned by their TSSs. (C)
Average gene profile of DNA:RNA hybrids at genes associated with antisense transcripts. (D) Genes with increased mRNA levels upon RNase H
overexpression were significantly associated with antisense transcripts compared to all transcripts represented by the microarray. (E) Antisense-associated
DNA:RNA hybrid-enriched genes in wild type have lower transcription frequency compared to non-antisense-associated DNA:RNA hybrid-enriched
genes. Genes up-regulated at the transcript level by RNase H overexpression have lower transcription frequency compared to all genes on
the expression microarray. Intervals indicate range of the 95% of genes closest to the average in each sample. Averages stated above each bar. P
values indicate significant decrease in transcriptional frequency (Wilcoxon rank sum test). (F) Overlap between DNA:RNA hybrid-enriched genes and
RNase H-modulated transcripts sorted by antisense association according to the Yassour et al. 2010 database. For genes that are both hybrid-enriched
and modulated at the transcript level by RNase H overexpression, the antisense association (100%) is significantly higher (Fisher's exact test
p<2.2e~'*) than those of the parent datasets (37.4% for DNA:RNA hybrid-enriched genes, 43.9% for RNase H-modulated genes).
doi:10.1371/journal.pgen.1004288.g003

The mechanism underlying altered gene expression in cells
overexpressing RNase H remains unclear. While the association
with antisense transcription is compelling, alternative models exist.
One possibility is that the stress of RNase H overexpression
triggers gene expression programs that coincidentally are antisense
regulated. We analyzed gene ontology (GO) terms enriched
among genes whose expression was changed by RNase H
overexpression. Consistent with previous work, genes for iron
uptake and incorporation were strongly activated by RNase H
overexpression (p = 2.21e~'^) (Figure 4A, Supplementary
Table S6) and several of these iron transport genes (i.e. FRE4,
FRE2, FRE3, FET3, FET4) are antisense-associated ([51,54]),
suggesting that overexpression of RNase H activates transcription
of these genes by perturbing antisense-mediated regulation.
Alternatively, changes in RNase H levels may increase the cellular
iron requirements since sensitivity to low iron concentration is
associated with DNA damage and repair [55]. To test this
alternative hypothesis, we tested the RNase H deletion and sen1-1
mutants for sensitivity to low iron conditions compared to a fet3Δ
positive control (Figure 4B). The sen1-1 mutant, RNase H
depletion or overexpression did not induce sensitivity to low iron,
ruling out the possibility that the transcriptional response in cells
overexpressing RNase H was a result of cellular iron requirement.
Collectively, our DRIP-chip and microarray analysis suggest that
DNA:RNA hybrids may be an important player in antisense-mediated
gene regulation.

Cytological profiling of RNA processing mutants for R 
loop formation 

Transcription-coupled DNA:RNA hybrids have been shown to
accumulate in a diverse set of transcription and RNA processing
mutants involved in a wide range of transcription related processes
(Table 1). To gain a broader understanding of factors involved in
R loop formation, we performed a cytological screen of RNA
processing, transcription and chromatin modification mutants for
DNA:RNA hybrids using the S9.6 antibody. Importantly, previous
work in our lab has shown that all of the mutants screened exhibit
chromosome instability (CIN), which would be consistent with
increased hybrid formation [53]. Significantly elevated hybrid
levels were found in 22 of the 40 mutants tested compared to wild
type, including a SUB2 mutant, which has been previously linked to
R loop formation (Figure 5, [4]). We also assayed some of the
well-characterized R-loop forming mutants, RNase H, Sen1 and
Hpr1, as positive controls for elevated DNA:RNA hybrid levels
(Figure 5).

In our screen, we detected hybrids in mutants affecting several 
pathways linked to DNA:RNA hybrid formation such as 
transcription, nuclear export and the exosome (Figure 5, 
Table 1). Consistent with findings in metazoan cells, we also 
observed hybrid formation in some splicing mutants (Figure 5, 
Table 1; [56]). Several rRNA processing mutants were enriched 
for DNA:RNA hybrids (7 out of the 22 positive hits), likely due to 
DNA:RNA hybrid accumulation at rDNA genes, a sensitized 
hybrid formation site (Figure 1; [2]). It is possible that, as seen in 
mRNA cleavage and polyadenylation mutants, DNA:RNA hybrid 
formation may contribute to their CIN phenotypes [6]. Currently,
there are 52 yeast genes whose disruptions have been found to lead 
to DNA:RNA hybrid accumulation, 21 of which were newly 
identified by our screen (Table 1). The success of this small-scale 
screen suggests that most RNA processing pathways suppress 
hybrid formation to some degree and that many DNA:RNA 
hybrid forming mutants remain undiscovered. 

DRIP-chip profiling of R loop forming mutants 

To better understand the mechanism by which cells regulate
DNA:RNA hybrids, we performed DRIP-chip analysis of
rnh1Δrnh201Δ, hpr1Δ, and sen1-1 mutants in order to determine
if these contribute differentially to the DNA:RNA hybrid genomic
profile. The rnh1Δrnh201Δ, hpr1Δ, and sen1-1 mutants are
particularly interesting because they have well established roles
in the regulation of transcription dependent DNA:RNA hybrid
formation. Our DRIP-chip profiles revealed that, similar to wild
type profiles, the mutant profiles were enriched for DNA:RNA
hybrids at rDNA, telomeres, and retrotransposons (Figure 6,
Supplementary Tables S1, S2, S3). The rnh1Δrnh201Δ, hpr1Δ,
and sen1-1 mutants also exhibited DNA:RNA hybrid enrichment
in 1206, 1490 and 1424 ORFs respectively compared to the 1217
DNA:RNA hybrid enriched ORFs identified in wild type
(Supplementary Table S4). Interestingly, in addition to the
similarities described above, our profiles also identified differential
effects of the mutants on the levels of DNA:RNA hybrids. In
particular, we observed that deletion of HPR1 resulted in higher
levels of DNA:RNA hybrids along the length of most ORFs with a
preference for longer genes compared to wild type (Figure 7A, 7B
and 7C). This observation is consistent with Hpr1's role in
bridging transcription elongation to mRNA export and its
localization at actively transcribed genes ([4,57-59]). In contrast,
mutating SEN1 resulted in higher levels of DNA:RNA hybrids at
shorter genes (Figure 7A and 7B), which is consistent with Sen1's
role in transcription termination particularly for short protein-coding
genes ([5,60,61]). The rnh1Δrnh201Δ mutant revealed
higher levels of DNA:RNA hybrids at highly transcribed and
longer genes (Figure 7A and 7B), which is supported by a wealth
of evidence of RNase H's role in suppressing R loops in long genes
to prevent collisions between transcription and replication
machineries ([8,62]).

Figure 4. Pathways altered at the transcript level by RNase H overexpression. (A) Gene Ontology term network of genes with increased
(left) or decreased (right) mRNA levels upon RNase H overexpression. Representative terms from Supplementary Table S10 are shown. Node size
indicates fold enrichment. Node color indicates the number of genes associated with each term (the darkest indicating the greatest number of genes
associated). Edge thickness indicates the number of genes shared between terms. (B) 10-fold serial dilutions on BPS iron plates testing low iron
concentration sensitivity of wild type versus DNA:RNA hybrid forming mutants reveal a lack of cellular iron requirement in RNase H mutant strains.
doi:10.1371/journal.pgen.1004288.g004

Further inspection of our profiles also revealed that
rnh1Δrnh201Δ and sen1-1 mutants but not the hpr1Δ mutant had
increased DNA:RNA hybrids at tRNA genes (two tailed unpaired
Wilcoxon test p=1.56e in the rnh1Δrnh201Δ mutant and
1.68e"'-^ in the sen1-1 mutant) (Figure 8A, 8B and 8C,
Supplementary Table S7) and this was confirmed by DRIP-quantitative
PCR (qPCR) of two tRNA genes in wild type and
rnh1Δrnh201Δ (Supplementary Figure S5). Because tRNAs are
transcribed by RNA polymerase III, this observation indicates that
Hpr1 is primarily involved in the regulation of RNA polymerase II
specific DNA:RNA hybrids while RNase H and Sen1 have roles in
a wider range of transcripts. Mutation of SEN1 also led to
increased levels of DNA:RNA hybrids at snoRNA (two tailed
unpaired Wilcoxon test p=1.81e"'^) (Figure 8D, 8E and 8F,
Supplementary Table S8) consistent with its role in 3' end
processing of snoRNAs ([63]).
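The mutant versus wild type comparisons at tRNA and snoRNA genes use a two-tailed unpaired Wilcoxon (rank sum) test. A minimal sketch of that test follows, using the normal approximation with a tie correction and no continuity correction. The source does not specify the implementation or its options, so this is an assumed variant; the function names and the per-gene scores are hypothetical.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns (U statistic for x, two-sided p-value). The approximation is
    intended for reasonably large samples.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u_x = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Variance of U with the standard correction for tied values.
    tie_counts = {}
    for value in pooled:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u_x, 1.0
    z = (u_x - n1 * n2 / 2) / sqrt(variance)
    return u_x, erfc(abs(z) / sqrt(2))


# Hypothetical average hybrid scores at five tRNA genes (not study data).
mutant_scores = [2.1, 2.4, 1.9, 2.8, 2.2]
wild_type_scores = [1.0, 1.2, 0.9, 1.1, 1.3]
u_stat, p_value = rank_sum_test(mutant_scores, wild_type_scores)
print(u_stat, round(p_value, 4))
```

A rank-based test makes no assumption that hybrid scores are normally distributed, which suits occupancy scores that are typically skewed.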

Discussion 

The genomic profile of DNA:RNA hybrids 

Identifying the landscape of genomic loci predisposed to
DNA:RNA hybrids is of fundamental importance to delineating
mechanisms of hybrid formation and the contributions of various
cellular pathways. Although our profiles depend on the specificity
of the anti-DNA:RNA hybrid S9.6 monoclonal antibody, this
aspect has been well characterized [44] and several of our
observations are consistent with what has been reported in the
literature. Locus specific tests showed that DNA:RNA hybrids
occur more frequently at genes with high transcriptional frequency
and GC content [4,5,18]. Moreover, in rnh201Δ cells, there is an
inverse relationship between GC content and gene expression
levels, suggesting that DNA:RNA hybrids accumulate at regions of
high GC content and block transcription in the absence of RNase
H [64].

Table 1. List of yeast genes that affect DNA:RNA hybrid formation.

Yeast gene linked to DNA:RNA hybrid formation | Reference
Exosome and RNA degradation: DIS3, RRP6, XRN1 | This study, [7,22]
Helicase: SEN1, SRS2 | [5,9]
mRNA cleavage and polyadenylation: CLP1, CFT2, FIP1, PCF11, RNA14, RNA15, TRF4 | [6,22,75]
mRNA export: MEX67, MTR2, NAB2, NUP133, RNA1, SAC3, SRM1, SUB2, SUS1, THP1, YRA1 | This study, [6,19,22,76,77]
Other processes: tjLz, iv\t i, rjn /, ji j>i | This study
RNA Polymerase II transcription and chromatin modification: LEO1, MED12, MED13, MOT1, NPL3, RTTWB, SDS3, SIN3, SPT2, TAPS | This study, [7,9,23,78]
RNase H: RNH201, RNH1 | This study, [6,7]
rRNA processing factors: DBP6, DBP7, IMP4, RPF1, SNU13, SNU66 | This study
Splicing: MUD2, SNU114, PRP31, YHC1, SNU13, SNU66 | This study
THO transcription elongation: THO2, HPR1, MFT1, THP2 | [6,18,58]
Topoisomerase: TOP1 | [2]

doi:10.1371/journal.pgen.1004288.t001

Our work extends the knowledge of DNA:RNA hybrids
from a few locus-specific observations to show that, in wild type,
there are potentially hundreds of hybrid prone genes that tend to
be shorter in length, frequently transcribed and high in GC
content [2,4,56]. The latter is consistent with recent studies in
human cells that demonstrated that genomic regions with high GC
skew are prone to R loop formation, which plays a regulatory role
in DNA methylation [26,27]. However, while we determined the
relationship between GC content and DNA:RNA hybrid formation,
we were unable to do the same analysis for GC skew, likely
due to the low level of GC skew and lack of DNA methylation in
Saccharomyces. This is unsurprising since the best characterized
functional element associated with GC skew, CpG island
promoters [26,27], are not found in yeast. Importantly, our
findings at retrotransposons support the notion that expression
levels and not GC content contribute more to DNA:RNA hybrid
forming potential. Additionally, DRIP-chip analysis of wild type
cells identified hybrid enrichment at rDNA, retrotransposons, and
telomeric regions. Along with previous studies, our DRIP-chip
analysis confirms that rDNA is a hybrid prone genomic site and
suggests that many factors of rRNA processing and ribosome
assembly suppress potentially damaging rDNA:rRNA hybrid
formation [2,7]. The presence of TERRA-DNA hybrids at
telomeres is supported by our observation of significant
hybrid signal at telomeric repeat regions across all DRIP-chip
experiments.

Figure 5. DNA:RNA hybrid cytological screen revealed high DNA:RNA hybrid levels in RNA processing and chromatin modification
mutants. Asterisks indicate mutants with significantly increased levels of DNA:RNA hybrids compared to wild type (p<0.00024). Error bars indicate
standard error of the mean. Representative chromosome spreads are shown: blue stain is DNA (DAPI) and the red foci are DNA:RNA hybrids.
doi:10.1371/journal.pgen.1004288.g005

Figure 6. Genome-wide profiles of DNA:RNA hybrids revealed similar enrichment of rDNA, retrotransposons and telomeres in
wild type and mutants. DRIP-chip chromosome plot of DNA:RNA hybrids in wild type, rnh1Δrnh201Δ, hpr1Δ and sen1-1 at chromosome XII. The
average of two replicates per strain is shown. Bars indicate ORFs (grey), rDNA (purple), retrotransposons (green) or genes associated with an antisense
transcript (red) [51,54]. Grey boxes delineate telomeric repeat regions. Y-axis indicates relative occupancy of DNA:RNA hybrids. X-axis indicates
chromosomal coordinates. P indicates probability of observing a number of enriched features below what was observed (P>0.99997).
doi:10.1371/journal.pgen.1004288.g006
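This section distinguishes GC content from GC skew as sequence properties related to hybrid formation. As a minimal sketch, the two quantities can be computed for a DNA sequence as follows, assuming the commonly used skew definition (G - C) / (G + C) evaluated on a single strand. The function names and the example sequence are illustrative and do not reproduce the study's own calculations.

```python
def gc_content(sequence):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def gc_skew(sequence):
    """Strand asymmetry of G over C: (G - C) / (G + C) on the given strand."""
    seq = sequence.upper()
    g_count, c_count = seq.count("G"), seq.count("C")
    if g_count + c_count == 0:
        return 0.0
    return (g_count - c_count) / (g_count + c_count)


def sliding_windows(sequence, size, step):
    """Yield (start, window) pairs for full-length windows along a sequence."""
    for start in range(0, len(sequence) - size + 1, step):
        yield start, sequence[start:start + size]


# Hypothetical example sequence (not taken from the yeast genome).
example = "GGGCATAT"
content = gc_content(example)
skew = gc_skew(example)
window_skews = [gc_skew(window) for _, window in sliding_windows(example, 4, 4)]
print(content, skew, window_skews)
```

The two measures are independent: a sequence can be GC-rich with zero skew (equal G and C on a strand) or GC-poor with strong skew, which is why a result for GC content does not automatically carry over to GC skew.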

Antisense association of DNA:RNA hybrids 

The DRIP-chip dataset is a resource for future studies seeking to
elucidate the localization of DNA:RNA hybrids across antisense-associated
regions and the impact of DNA:RNA hybrid removal
on genome-wide transcription. We observed that genes associated
with antisense transcripts were significantly enriched for
DNA:RNA hybrids and modulated at the transcript level by
RNase H overexpression. Antisense regulation has been reported
at mammalian rDNA and yeast Ty1 retrotransposons, loci that
were also enriched for DNA:RNA hybrids in our DRIP-chip
[49,50]. The role of DNA:RNA hybrids and RNase H in antisense
regulation is currently unclear. However, there are several non-exclusive
models of antisense gene regulation. One model proposes
that the physical presence of the antisense transcripts is crucial to
antisense gene regulation. For instance, trans-acting antisense
transcripts have been shown to control Ty1 retrotransposon
transcription, reverse transcription and retrotransposition [65].
Another study has further shown that trans-acting antisense
transcripts that only overlap with the sense strand promoter can
block sense transcription, potentially by hybridizing with the non-template
DNA strand [33]. These findings suggest that antisense
transcription in cis is not necessary as long as the antisense
transcript is present. It is possible that DNA:RNA hybrids may be
formed by the antisense or the sense transcript with genomic
DNA. Moreover, DNA:RNA hybrids may play a functional role in
antisense transcription regulation, as shown by antisense-associated
genes both enriched for DNA:RNA hybrids and affected
transcriptionally by RNase H overexpression. Experiments comparing
the ratio of antisense versus sense transcripts and
determining the amount of DNA:RNA hybrid formation by either
transcript under conditions known to regulate the particular gene
will further elucidate the role of RNase H and DNA:RNA hybrids
in antisense regulation.

Figure 7. Mutant specific trends in protein-coding genes prone to DNA:RNA hybrid formation. (A-C) CHROMATRA plots of DNA:RNA
hybrid distribution along genes sorted by their length (A), grouped into five transcriptional frequency categories as per [69] (B) or grouped into four
GC content categories (C). Genes were aligned by their TSSs.
doi:10.1371/journal.pgen.1004288.g007

Figure 8. RNase H and Sen1 mutants displayed elevated levels of DNA:RNA hybrids at tRNA and snoRNA genes. (A) Sample plot of
relative DNA:RNA hybrid occupancy at a tRNA gene on chromosome X. For A and D, colored lines represent the average enrichment of the indicated
strains. Purple bars indicate the tRNA or snoRNA genes respectively and gray boxes represent ORFs. (B) Average profile of DNA:RNA hybrids at all
[…] VII. (E) Average profile of DNA:RNA hybrids at all snoRNAs. (F) Average DNA:RNA hybrid score at each snoRNA. P indicates probability of observing a
number of enriched features below what was observed (P>0.99997).
doi:10.1371/journal.pgen.1004288.g008

DRIP-chip analysis of hybrid-resolving mutants 

Our investigation of mutant-specific DNA:RNA hybrid formation
sites is consistent with the existing literature on Hpr1, Sen1
and RNase H. Significantly, the hpr1Δ and rnh1Δrnh201Δ mutants
exhibited increased DNA:RNA hybrid levels along the length of
long genes, while the sen1-1 mutant exhibited increased
DNA:RNA hybrid levels along the length of short genes
(Figure 7A). This is consistent with Hpr1's function in
transcription elongation and mRNA export, and with RNase H's role
in preventing collisions between the transcription apparatus and
replication forks, which carry greater consequence for long genes
([4,57-59,62]). In contrast, Sen1 is particularly important for
transcription termination at short genes ([61]).

In addition, the RNase H deletion and sen1-1 mutants had
increased hybrids at tRNA genes, suggesting that RNase H and Sen1
are both required to prevent tRNA:DNA hybrid accumulation.
Interestingly, a recent study found that the mRNA levels of genes
encoding RNA polymerase III and proteins that modify tRNA are
increased in an rnh1Δrnh201Δ mutant [64], which may be in
response to a lack of properly processed tRNA transcripts. The
finding that both tRNA and snoRNA genes were enriched for
hybrids in sen1-1 highlights the role of Sen1 in RNA polymerase I,
II and III transcription termination and transcript maturation
[60,63,66]. More broadly, our data and the literature support the
notion that transcripts from RNA polymerases I, II and III can be
subject to DNA:RNA hybrid formation, especially in RNA
processing mutant backgrounds.

Perspective 

Factors regulating ectopic, genome destabilizing DNA:RNA 
hybrids are best characterized in yeast, although less is known 
about the functions of native R loop structures. The genome-wide 
maps of DNA:RNA hybrids presented here recapitulate the known 
sites of hybrid formation but also add important new insights into
potential functions of R loops. Most importantly, we demonstrate 
the usefulness of DRIP profiling for detecting biologically 
meaningful differences in mutant strains. Therefore, DRIP 
profiling of yeast genomes in various mutant backgrounds will 
be key to understanding the causes and consequences of 
inappropriate R loop formation and how these are modulated 
by other cellular pathways. 

Methods 

Strains and plasmids 

All strains are listed in Supplementary Table S9. For RNase H
overexpression experiments, recombinant human RNase H1
was expressed from plasmid p425-GPD-RNase H1 (2μ, LEU2,
GPDpr-RNase H1) and compared to an empty control plasmid
p425-GPD (2μ, LEU2, GPDpr) [7].

DRIP-chip and qPCR 

Briefly, cells were grown overnight, diluted to 0.15 OD600 and
grown to 0.7 OD600. Crosslinking was done with 1% formaldehyde
for 20 minutes. Chromatin was purified as described
previously [67] and sonicated to yield approximately 500 bp
fragments. 40 μg of the anti-DNA:RNA hybrid monoclonal mouse
antibody S9.6 (gift from Stephen Leppla) was coupled to 60 μL
of protein A magnetic beads (Invitrogen). For ChIP-qPCR,
crosslinking reversal and DNA purification were followed by
qPCR analysis of the immunoprecipitated and input DNA. DNA
was analyzed using a Rotor-Gene 600 (Corbett Research) and
PerfeCTa SYBR green FastMix (Quanta Biosciences). Samples
were analyzed in triplicate on three independent DRIP samples
for wild type and rnh1Δrnh201Δ. Primers are listed in
Supplementary Table S11.

For DRIP-chip, precipitated DNA was amplified via two rounds
of T7 RNA polymerase amplification ([68]), biotin labeled and
hybridized to Affymetrix 1.0R S. cerevisiae microarrays. Samples
were normalized to a no antibody control sample (mock) using the
rMAT software and relative occupancy scores were calculated for
all probes using a 300 bp sliding window. All profiles were
generated in duplicate and replicates were quantile normalized
and averaged. Spearman correlation scores between replicates are
listed in Supplementary Table S10. Coordinates of enriched
regions are available in Datasets S1, S2, S3, S4, S5, S6, S7 and
S8. DRIP-chip data is available at ArrayExpress E-MTAB-2388.
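The replicate-combination step (quantile normalization followed by averaging) can be illustrated with a minimal Python sketch. This is not the rMAT implementation or the authors' script; the function names and toy scores are hypothetical, and the sketch assumes both replicates are scored on the same probes, with ties broken by probe order.

```python
from statistics import mean

def quantile_normalize(rep_a, rep_b):
    """Map two equal-length score vectors onto one shared distribution.

    Each value is replaced by the mean of the values that hold the same
    rank in the two replicates (ties broken by probe order).
    """
    if len(rep_a) != len(rep_b):
        raise ValueError("replicates must cover the same probes")
    # Reference distribution: mean of the rank-matched sorted values.
    reference = [mean(pair) for pair in zip(sorted(rep_a), sorted(rep_b))]

    def remap(scores):
        order = sorted(range(len(scores)), key=lambda i: scores[i])
        out = [0.0] * len(scores)
        for rank, probe_index in enumerate(order):
            out[probe_index] = reference[rank]
        return out

    return remap(rep_a), remap(rep_b)

def average_replicates(rep_a, rep_b):
    """Quantile normalize two replicates, then average them probe by probe."""
    norm_a, norm_b = quantile_normalize(rep_a, rep_b)
    return [(a + b) / 2 for a, b in zip(norm_a, norm_b)]

# Hypothetical relative occupancy scores for four probes.
rep1 = [0.2, 1.8, 0.9, 2.4]
rep2 = [0.4, 2.6, 1.1, 2.0]
combined = average_replicates(rep1, rep2)
```

After normalization both replicates contain exactly the same set of values, so the average reflects agreement in probe ranking rather than differences in array-wide intensity.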

DRIP-chip analysis 

Enriched features had at least 50% of the probes contained in 
the feature above the threshold of 1.5. Only features enriched in 
both replicates were reported. Transcriptional frequency [69], GC 
content ([70]) and gene length were compared using the Wilcoxon 
rank sum test. Antisense association was analyzed by Fisher's
exact test using R. Statistical analysis of genomic feature
enrichment was performed using a Monte Carlo simulation,
which randomly generates start positions for the particular set of
features and calculates the proportion of that feature that would be
enriched in a given DRIP-chip profile if the feature were
distributed at random [67]. 500 simulations were run per feature 
for each DRIP-chip replicate to obtain mean and standard 
deviation values. These values were used to calculate the 
cumulative probability (P) on a normal distribution of seeing a 
score lower than the observed value by chance. 
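The enrichment rule and the Monte Carlo test described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: features are given as hypothetical (start, length) pairs in probe indices rather than base pairs, the toy profile is invented, and `monte_carlo_p` is a made-up name.

```python
import random
from statistics import NormalDist, mean, stdev

THRESHOLD = 1.5      # relative occupancy cutoff stated in the text
MIN_FRACTION = 0.5   # at least 50% of a feature's probes must exceed it

def is_enriched(probe_scores):
    """Per-feature rule: enough of the feature's probes lie above the cutoff."""
    if not probe_scores:
        return False
    above = sum(score > THRESHOLD for score in probe_scores)
    return above / len(probe_scores) >= MIN_FRACTION

def enriched_fraction(profile, features):
    """Proportion of (start, length) features called enriched in one profile."""
    calls = [is_enriched(profile[start:start + length])
             for start, length in features]
    return sum(calls) / len(calls)

def monte_carlo_p(profile, features, n_sim=500, seed=0):
    """Return (observed proportion, cumulative normal probability P)."""
    rng = random.Random(seed)
    observed = enriched_fraction(profile, features)
    simulated = []
    for _ in range(n_sim):
        # Keep each feature's length but draw a random start position.
        shuffled = [(rng.randint(0, len(profile) - length), length)
                    for _, length in features]
        simulated.append(enriched_fraction(profile, shuffled))
    mu, sigma = mean(simulated), stdev(simulated)
    if sigma == 0:
        raise ValueError("simulated proportions have no variance")
    # P: probability, under the fitted normal, of a score below the observed one.
    return observed, NormalDist(mu, sigma).cdf(observed)

# Toy profile (hypothetical): 200 probes with one 20-probe hybrid-prone region.
profile = [0.5] * 100 + [3.0] * 20 + [0.5] * 80
features = [(100, 10), (105, 10), (110, 10)]
observed, p = monte_carlo_p(profile, features)
```

In the study this calculation was run per feature class for each replicate, and only features enriched in both replicates were reported; the sketch covers a single profile.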

DRIP-chip visualization 

CHROMATRA plots were generated as described previously
([71]). Relative occupancy scores for each transcript were binned
into segments of 150 bp. Transcripts were sorted by their length,
transcriptional frequency or GC content and aligned by their
Transcription Start Sites (TSS). For transcriptional frequency,
transcripts were grouped into five classes according to their
transcriptional frequency described by Holstege et al. 1998. For GC
content, transcripts were grouped into four classes according to
their GC content obtained from BioMart ([70]). Average gene,
tRNA or snoRNA profiles were generated by averaging all the
probes that were encompassed by the features of interest. For
averaging ORFs, corresponding probes were split into 40 bins
while 1500 bp of UTRs and their probes were split into 20 bins.
For smaller features like tRNAs and snoRNAs, corresponding
probes were split into only 3 bins. Average enrichment scores were
calculated using in-house scripts that average the score of all the
probes encompassed by the feature.
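The in-house averaging scripts are not part of the paper, so the following Python sketch only illustrates the binning idea under simple assumptions: each feature is an ordered list of probe scores, bins are made as equal in size as possible, and the function names and toy values are hypothetical.

```python
from statistics import mean

def bin_scores(scores, n_bins):
    """Split ordered probe scores into n_bins near-equal bins; average each bin."""
    if len(scores) < n_bins:
        raise ValueError("need at least one probe per bin")
    # Integer edges guarantee every bin holds at least one probe.
    edges = [i * len(scores) // n_bins for i in range(n_bins + 1)]
    return [mean(scores[edges[i]:edges[i + 1]]) for i in range(n_bins)]

def average_profile(features, n_bins):
    """Average the binned profiles of many features, bin by bin."""
    binned = [bin_scores(scores, n_bins) for scores in features]
    return [mean(column) for column in zip(*binned)]

# Two hypothetical tRNA genes of different lengths, reduced to 3 bins each.
trna_features = [
    [1, 2, 3, 4, 5, 6],
    [3, 3, 3, 5, 5, 5, 7, 7, 7],
]
trna_profile = average_profile(trna_features, n_bins=3)
```

Binning puts features of different lengths on a common axis, which is why ORFs (40 bins), UTRs (20 bins) and short tRNA or snoRNA genes (3 bins) can each be summarized as one average profile.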

Gene expression microarray 

Gene expression microarray data is available at GEO
GSE46652. Strains harboring the RNase H1 over-expression
plasmid or empty vector were grown in SC-Leucine at 30°C. All
profiles were generated in duplicate. Total RNA was isolated from
1 OD600 of yeast cells using a RiboPure Yeast kit (A&B Applied
Biosystems), amplified, labeled, fragmented using a MessageAmp
III RNA Amplification Kit (A&B Applied Biosystems) and
hybridized to a GeneChip Yeast Genome 2.0 microarray using
the GeneChip Hybridization, Wash, and Stain Kit (Affymetrix).
Arrays were scanned by the GeneChip Scanner 3000 7G and
expression data was extracted using Expression Console Software
(Affymetrix) with the MAS5.0 statistical algorithm. All arrays were
scaled to a median target intensity of 500. Cutoffs of a p-value
of 0.05 and a minimum signal strength of 100 across all samples
were applied, and only transcripts that had over a 2-fold change in
the RNase H over-expression strain compared to wild type were
considered significant. The correlation between duplicate
biological samples was: control (r = 0.9955), RNase H over-expression
(r = 0.9719). For statistical analysis, GC content, transcription
frequencies and antisense association were analyzed as for
DRIP-chip analysis.
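The three expression filters can be written as a short Python sketch. The exact way the cutoffs were applied is not spelled out in the text, so the details here are assumptions: the p-value cutoff is read as a maximum over all arrays, the signal cutoff as a minimum over all arrays, and the fold change is taken between replicate means. The function name and toy values are hypothetical.

```python
from statistics import mean

def is_modulated(control, overexpression, p_values,
                 max_p=0.05, min_signal=100.0, min_fold=2.0):
    """Return True if a transcript passes all three filters.

    control / overexpression: scaled signal intensities, one per replicate.
    p_values: one p-value per array for this transcript.
    """
    if max(p_values) > max_p:
        return False
    if min(control + overexpression) < min_signal:
        return False
    ratio = mean(overexpression) / mean(control)
    # "Over a 2-fold change" is read as either up- or down-regulation.
    return ratio > min_fold or ratio < 1.0 / min_fold

# Hypothetical duplicate measurements for four transcripts.
p_ok = [0.01, 0.01, 0.01, 0.01]
up = is_modulated([200, 220], [510, 530], p_ok)      # about 2.5-fold up
down = is_modulated([500, 520], [200, 210], p_ok)    # about 2.5-fold down
flat = is_modulated([400, 420], [450, 430], p_ok)    # essentially unchanged
weak = is_modulated([40, 45], [150, 160], p_ok)      # fails the signal cutoff
```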

Yeast chromosome spreads 

Cells were grown to mid-log phase in YEPD rich media at
30°C and washed in spheroplasting solution (1.2 M sorbitol, 0.1
M potassium phosphate, 0.5 M MgCl2, pH 7) and digested in
spheroplasting solution with 10 mM DTT and 150 μg/mL
Zymolyase 20T at 37°C for 20 minutes similar to previously
described ([72]). The digestion was halted by addition of
ice-cold stop solution (0.1 M MES, 1 M sorbitol, 1 mM EDTA,
0.5 mM MgCl2, pH 6.4) and spheroplasts were lysed with 1%
vol/vol Lipsol and fixed on slides using 4% wt/vol
paraformaldehyde/3.4% wt/vol sucrose ([73]). Chromosome spread slides
were incubated with the mouse monoclonal antibody S9.6
(1 ng/mL in blocking buffer of 5% BSA, 0.2% milk and 1×
PBS). The slides were further incubated with a secondary
Cy3-conjugated goat anti-mouse antibody (Jackson Laboratories,
#115-165-003, diluted 1:1000 in blocking buffer). For each
replicate, at least 100 nuclei were visualized and manually
counted to obtain the fraction with detectable DNA:RNA
hybrids. Each mutant was assayed in triplicate. Mutants were
compared to wild type by Fisher's exact test. To correct for
multiple hypothesis testing, we implemented a cut off of p<0.01
divided by the total number of mutants compared to wild type,
meaning mutants with p<0.00024 were considered significantly
different from wild type.
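The comparison of each mutant to wild type can be illustrated with a standard-library Python sketch of a two-sided Fisher's exact test combined with the corrected cutoff. The nuclei counts are invented, pooling the triplicates into one 2x2 table is an assumption, and the number of mutants (41) is a hypothetical value chosen only because 0.01/41 is close to the reported 0.00024.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

N_MUTANTS = 41                  # hypothetical number of mutants tested
CUTOFF = 0.01 / N_MUTANTS       # corrected threshold, about 0.00024

# Hypothetical pooled counts: (nuclei with hybrid signal, nuclei without).
wild_type = (10, 290)
mutant = (60, 240)
p_value = fisher_exact_two_sided(*mutant, *wild_type)
significant = p_value < CUTOFF
```

Dividing the 0.01 cutoff by the number of comparisons is a Bonferroni-style correction, so a mutant is called different from wild type only when its p-value falls below the stricter threshold.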

BPS sensitivity assay 

10-fold serial dilutions of each strain were spotted on 90 μM BPS
plates with FeSO4 concentrations of 0, 2.5, 20 or 100 μM and
grown at 30°C for 3 days [55].

A summary of this paper was presented at the 26th International
Conference on Yeast Genetics and Molecular Biology, August
2013 [74].

Supporting Information 

Dataset S1 Wild type replicate 1 enriched region coordinates.
(XLSX)

Dataset S2 Wild type replicate 2 enriched region coordinates.
(XLSX)

Dataset S3 sen1-1 replicate 1 enriched region coordinates.
(XLSX)

Dataset S4 sen1-1 replicate 2 enriched region coordinates.
(XLSX)

Dataset S5 hpr1Δ replicate 1 enriched region coordinates.
(XLSX)

Dataset S6 hpr1Δ replicate 2 enriched region coordinates.
(XLSX)

Dataset S7 rnh1Δrnh201Δ replicate 1 enriched region coordinates.
(XLSX)

Dataset S8 rnh1Δrnh201Δ replicate 2 enriched region coordinates.
(XLSX)

Figure S1 Spearman correlation scatter plot of wild type
replicates.
(EPS)

Figure S2 Box plots comparing the distribution of (A) gene
length, (B) transcription frequency, (C) GC content, and (D)
mRNA transcript half-life of ORFs enriched for DNA:RNA
hybrids versus ORFs not enriched for DNA:RNA hybrids.
The p values calculated by the Wilcoxon rank sum test are
shown.
(EPS)

Figure S3 (A) Distribution of % GC content of all ORFs sorted
by transcriptional frequency. (B) Distribution of % GC content of
ORFs enriched for DNA:RNA hybrids in WT sorted by
transcriptional frequency. Intervals indicate 95% confidence
intervals.
(EPS)

Figure S4 Distribution of % GC content of all genes represented
on the expression microarray (n = 5657), and transcripts up-
(n = 212) or down-regulated (n = 88) by RNase H overexpression.
Intervals indicate 95% confidence intervals. Averages are stated
above each sample. The p-value indicates a significant increase in
GC content of upregulated genes compared to all microarray
transcripts (Wilcoxon rank sum test).
(EPS)

Figure S5 Relative quantities of (A) SUF2 tRNA gene and (B)
tV(UAC)D tRNA gene detected in WT or rnh1Δrnh201Δ as
detected by DRIP-quantitative PCR (qPCR). Error bars indicate
standard deviation.
(EPS)

Table S1 List of rDNA enriched for RNA:DNA hybrids in wild
type, rnh1Δrnh201Δ, hpr1Δ and sen1-1.
(XLSX)

Table S2 List of telomeric repeat regions enriched for
RNA:DNA hybrids in wild type, rnh1Δrnh201Δ, hpr1Δ and sen1-1.
(XLSX)

Table S3 List of ORFs enriched for RNA:DNA hybrids in wild
type, rnh1Δrnh201Δ, hpr1Δ and sen1-1.
(XLSX)

Table S4 List of retrotransposons enriched for RNA:DNA
hybrids in wild type, rnh1Δrnh201Δ, hpr1Δ and sen1-1.
(XLSX)

Table S5 Lists of open reading frames (ORFs) and antisense-
associated ORFs enriched for RNA:DNA hybrids in wild type or
modulated at the transcript level by RNase H overexpression.
(XLSX)

Table S6 GO function sorting of genes modulated at the
transcript level by RNase H overexpression.
(XLSX)

Table S7 List of tRNA genes enriched for RNA:DNA hybrids in
wild type, rnh1Δrnh201Δ, hpr1Δ and sen1-1.
(XLSX)






Table S8 List of snoRNA genes enriched for RNA:DNA hybrids
in wild type, rnh1Δrnh201Δ, hpr1Δ and sen1-1.
(XLSX)

Table S9 Strains used in this study.
(XLSX)

Table S10 Spearman correlation scores.
(XLSX)

Table S11 DRIP-qPCR primers.
(XLSX)